Minimal Nondeterministic Finite Automata and 
Atoms of Regular Languages * 



Janusz Brzozowski 1 and Hellis Tamm 2 

1 David R. Cheriton School of Computer Science, University of Waterloo, 
Waterloo, ON, Canada N2L 3G1 
{brzozo@uwaterloo.ca} 
2 Institute of Cybernetics, Tallinn University of Technology, 
Akadeemia tee 21, 12618 Tallinn, Estonia 
{hellis@cs.ioc.ee} 



Abstract. We examine the NFA minimization problem in terms of 
atomic NFA's, that is, NFA's in which the right language of every state is 
a union of atoms, where the atoms of a regular language are non-empty 
intersections of complemented and uncomplemented left quotients of the 
language. We characterize all reduced atomic NFA's of a given language, 
that is, those NFA's that have no equivalent states. Using atomic NFA's, 
we formalize Sengoku's approach to NFA minimization and prove that his 
method fails to find all minimal NFA's. We also formulate the Kameda- 
Weiner NFA minimization in terms of quotients and atoms. 

Keywords: regular language, quotient, atom, atomic NFA, minimal 

1 Introduction 

Nondeterministic finite automata (NFA's) have played a major role in the 
theory of finite automata and regular expressions and their applications ever 
since their introduction in 1959 by Rabin and Scott [10]. In particular, the 
intriguing problem of finding NFA's with the minimal number of states has 
received much attention. The problem was first stated by Ott and Feinstein [8] 
in 1961. Various approaches have been used over the years in attempts to answer 
this question; we mention a few examples here. In 1970, Kameda and Weiner [6] 
studied this problem using matrices related to the states of the minimal 
deterministic finite automata (DFA's) for a given language and its reverse. In 
1992, Arnold, Dicky, and Nivat [1] used a "canonical" NFA. In the same year, 
Sengoku [11] used "normal" NFA's and "standard formed" NFA's. In 1995, Matz and 
Potthoff [7] returned to the "canonical" automaton and introduced the 
"fundamental" automaton. In 2003, Ilie and Yu [5] applied equivalence 
relations. In 2005, Polak [9] used the "universal" automaton. 

Our approach is to use the recently introduced atoms and atomic languages [3] 
for this question; we briefly state some of their basic properties here. 

The (left) quotient of a regular language $L$ over an alphabet $\Sigma$ by a 
word $w \in \Sigma^*$ is the language $w^{-1}L = \{x \in \Sigma^* \mid wx \in L\}$. 
It is well known that the number of states in the complete minimal 
deterministic finite automaton recognizing $L$ is precisely the number of 
distinct quotients of $L$. Also, $L$ is its own quotient by the empty word 
$\varepsilon$, that is, $\varepsilon^{-1}L = L$. A quotient DFA is a DFA 
uniquely determined by a regular language; its states correspond to left 
quotients. The quotient DFA is isomorphic to the minimal DFA. 

An atom³ of a regular language $L$ with quotients $K_0, \ldots, K_{n-1}$ is any 
non-empty language of the form $\widetilde{K_0} \cap \cdots \cap \widetilde{K_{n-1}}$, 
where $\widetilde{K_i}$ is either $K_i$ or $\overline{K_i}$, and $\overline{K_i}$ 
is the complement of $K_i$ with respect to $\Sigma^*$. If the intersection with 
all quotients complemented is non-empty, then it constitutes the negative atom; 
all the other atoms are positive. Let the number of atoms be $m$, and let the 
number of positive atoms be $p$. Thus, if the negative atom is present, 
$p = m - 1$; otherwise, $p = m$. 

So atoms of $L$ are regular languages uniquely determined by $L$. They are 
pairwise disjoint and define a partition of $\Sigma^*$. Every quotient of $L$ 
(including $L$ itself) is a union of atoms, and every quotient of an atom is a 
union of atoms. Thus the atoms of a regular language are its basic building 
blocks. Also, $\overline{L}$ defines the same atoms as $L$. The atomaton is an 
NFA uniquely determined by a regular language; its states correspond to atoms. 
An NFA is atomic if the right language of every state is a union of atoms. 

Our contributions are as follows: 

1. We characterize all trim reduced atomic NFA's of a given language, where 
an NFA is reduced if it has no equivalent states. 

2. We show that, if $n_0$ is the minimal number of states of any NFA of a 
language, then the language may have trim reduced atomic NFA's with as few as 
$n_0$ states, and as many as $2^p - 1$ states. 

3. We demonstrate that the number of atomic minimal NFA's can be as low 
as 1, or very high. For example, the language $\Sigma^*ab\Sigma^*$ with 3 
quotients has 281 atomic minimal NFA's, and additional non-atomic ones. 

4. We formalize the work of Sengoku [11] in our framework. He had no concept 
of atoms, but used an NFA equivalent to the atomaton and NFA's equivalent 
to atomic NFA's. Our use of atoms significantly clarifies Sengoku's method. 

5. We prove that Sengoku's claim that an NFA can be made atomic by adding 
transitions and without changing the number of states is false. We show 
that there exist languages for which the minimal NFA's are all non-atomic. 
So Sengoku's claim that his method can always find a minimal NFA is also 
incorrect. 

6. We formulate the Kameda-Weiner NFA minimization method [6] in terms of 
quotients and atoms. 

3 The definition in [3] does not consider the intersection of all the complemented 
quotients to be an atom. Our new definition in [4] adds symmetry to the theory. 

In Section 2 we recall some properties of automata and atomata. Atomic 
NFA's are then presented in Section 3. Sengoku's method is studied in Section 4, 
and the Kameda-Weiner method, in Section 5. Section 6 concludes the paper. 

2 Automata and Atomata of Regular Languages 

A nondeterministic finite automaton (NFA) is a quintuple 
$\mathcal{N} = (Q, \Sigma, \eta, I, F)$, where $Q$ is a finite, non-empty set of 
states, $\Sigma$ is a finite non-empty alphabet, $\eta : Q \times \Sigma \to 2^Q$ 
is the transition function, $I \subseteq Q$ is the set of initial states, and 
$F \subseteq Q$ is the set of final states. As usual, we extend the transition 
function to functions $\eta' : Q \times \Sigma^* \to 2^Q$ and 
$\eta'' : 2^Q \times \Sigma^* \to 2^Q$, but use $\eta$ for all three. 

The language accepted by an NFA $\mathcal{N}$ is 
$L(\mathcal{N}) = \{w \in \Sigma^* \mid \eta(I, w) \cap F \neq \emptyset\}$. 
Two NFA's are equivalent if they accept the same language. The right language 
of a state $q$ is 
$L_{q,F}(\mathcal{N}) = \{w \in \Sigma^* \mid \eta(q, w) \cap F \neq \emptyset\}$. 
The right language of a set $S$ of states of $\mathcal{N}$ is 
$L_{S,F}(\mathcal{N}) = \bigcup_{q \in S} L_{q,F}(\mathcal{N})$; so 
$L(\mathcal{N}) = L_{I,F}(\mathcal{N})$. A state is empty if its right language 
is empty. Two states are equivalent if their right languages are equal. An NFA 
is reduced if it has no equivalent states. The left language of a state $q$ is 
$L_{I,q} = \{w \in \Sigma^* \mid q \in \eta(I, w)\}$. A state is unreachable if 
its left language is empty. An NFA is trim if it has no empty or unreachable 
states. An NFA is minimal if it has the minimal number of states among all the 
equivalent NFA's. 

A deterministic finite automaton (DFA) is a quintuple 
$\mathcal{D} = (Q, \Sigma, \delta, q_0, F)$, where $Q$, $\Sigma$, and $F$ are as 
in an NFA, $\delta : Q \times \Sigma \to Q$ is the transition function, and 
$q_0$ is the initial state. 

We use the following operations on automata: 

1. The determinization operation $\mathbb{D}$ applied to an NFA $\mathcal{N}$ 
yields a DFA $\mathcal{N}^{\mathbb{D}}$ obtained by the subset construction, 
where only subsets reachable from the initial subset of $\mathcal{N}^{\mathbb{D}}$ 
are used, and the empty subset, if present, is included. 

2. The reversal operation $\mathbb{R}$ applied to an NFA $\mathcal{N}$ yields an 
NFA $\mathcal{N}^{\mathbb{R}}$, where the sets of initial and final states are 
interchanged and all transitions are reversed. 

3. The trimming operation $\mathbb{T}$ applied to an NFA deletes all unreachable 
and empty states. 
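
The three operations can be sketched in Python as follows. This is only an illustration under our own assumptions: an NFA is encoded as a tuple `(states, alphabet, delta, initial, final)` with `delta` a dictionary from `(state, letter)` to a set of states, and the function names `reverse`, `determinize` and `trim` are ours.

```python
from itertools import chain

# Hypothetical encoding: an NFA is (states, alphabet, delta, initial, final),
# with delta a dict mapping (state, letter) to a set of successor states.

def step(delta, subset, a):
    """Set of states reached from `subset` by the letter a."""
    return frozenset(chain.from_iterable(delta.get((q, a), ()) for q in subset))

def reverse(nfa):
    """Reversal: swap initial and final states, reverse every transition."""
    states, sigma, delta, initial, final = nfa
    rdelta = {}
    for (q, a), targets in delta.items():
        for t in targets:
            rdelta.setdefault((t, a), set()).add(q)
    return set(states), sigma, rdelta, set(final), set(initial)

def determinize(nfa):
    """Subset construction over reachable subsets only.
    The empty subset is kept as a state whenever it is reached."""
    states, sigma, delta, initial, final = nfa
    start = frozenset(initial)
    seen, todo, ddelta = {start}, [start], {}
    while todo:
        s = todo.pop()
        for a in sigma:
            t = step(delta, s, a)
            ddelta[(s, a)] = {t}
            if t not in seen:
                seen.add(t)
                todo.append(t)
    dfinal = {s for s in seen if s & set(final)}
    return seen, sigma, ddelta, {start}, dfinal

def closure(seeds, sigma, delta):
    """States reachable from `seeds` along the transitions in delta."""
    seen, todo = set(seeds), list(seeds)
    while todo:
        q = todo.pop()
        for a in sigma:
            for t in delta.get((q, a), ()):
                if t not in seen:
                    seen.add(t)
                    todo.append(t)
    return seen

def trim(nfa):
    """Trimming: delete unreachable states and empty states."""
    states, sigma, delta, initial, final = nfa
    keep = closure(initial, sigma, delta) & closure(final, sigma, reverse(nfa)[2])
    tdelta = {(q, a): {t for t in ts if t in keep}
              for (q, a), ts in delta.items() if q in keep}
    return keep, sigma, tdelta, set(initial) & keep, set(final) & keep
```

A DFA fits the same encoding with singleton target sets, so the operations compose: `determinize(reverse(d))` builds the reversed and determinized automaton of a DFA `d`, and `trim` then removes the empty subset if it was reached.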

The following theorem is from [2] , and was also discussed in [3] : 

Theorem 1 (Determinization). If $\mathcal{D}$ is a DFA accepting a language $L$, 
then $\mathcal{D}^{\mathbb{RD}}$ is a minimal DFA for $L^R$. 

Let $L$ be any non-empty regular language, and let its set of quotients be 
$\mathcal{K} = \{K_0, \ldots, K_{n-1}\}$. One of the quotients of $L$ is $L$ 
itself; this is called the initial quotient and is denoted by $K_{in}$. A 
quotient is final if it contains the empty word $\varepsilon$. The set of final 
quotients is $\mathcal{F} = \{K_i \mid \varepsilon \in K_i\}$. 

In the following definition we use a 1-1 correspondence 
$K_i \leftrightarrow \mathbf{K}_i$ between quotients $K_i$ of a language $L$ and 
the states $\mathbf{K}_i$ of the quotient DFA $\mathcal{D}$ defined below. We 
refer to the $\mathbf{K}_i$ as quotient symbols. 

Definition 1. The quotient DFA of $L$ is 
$\mathcal{D} = (\mathbf{K}, \Sigma, \delta, \mathbf{K}_{in}, \mathbf{F})$, where 
$\mathbf{K} = \{\mathbf{K}_0, \ldots, \mathbf{K}_{n-1}\}$, $\mathbf{K}_{in}$ 
corresponds to $K_{in}$, $\mathbf{F} = \{\mathbf{K}_i \mid K_i \in \mathcal{F}\}$, 
and $\delta(\mathbf{K}_i, a) = \mathbf{K}_j$ if and only if $a^{-1}K_i = K_j$, 
for all $\mathbf{K}_i, \mathbf{K}_j \in \mathbf{K}$ and $a \in \Sigma$. 

In a quotient DFA the right language of $\mathbf{K}_i$ is $K_i$, and its left 
language is $\{w \in \Sigma^* \mid w^{-1}L = K_i\}$. The language 
$L(\mathcal{D})$ is the right language of $\mathbf{K}_{in}$, and hence 
$L(\mathcal{D}) = L$. DFA $\mathcal{D}$ is minimal, since all quotients in 
$\mathcal{K}$ are distinct. 

It follows from the definition of an atom that a regular language $L$ has at 
most $2^n$ atoms. An atom is initial if it has $L$ (rather than $\overline{L}$) 
as a term; it is final if it contains $\varepsilon$. Since $L$ is non-empty, it 
has at least one quotient containing $\varepsilon$. Hence it has exactly one 
final atom, the atom $\widehat{K_0} \cap \cdots \cap \widehat{K_{n-1}}$, where 
$\widehat{K_i} = K_i$ if $\varepsilon \in K_i$, and 
$\widehat{K_i} = \overline{K_i}$ otherwise. Let $A = \{A_0, \ldots, A_{m-1}\}$ 
be the set of atoms of $L$. By convention, $I$ is the set of initial atoms, 
$A_{p-1}$ is the final atom and the negative atom, if present, is $A_{m-1}$. The 
negative atom is not reachable from $I$ and can never be final, since there 
must be at least one final quotient in its intersection. 

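As above, we use a 1-1 correspondence $A_i \leftrightarrow \mathbf{A}_i$ between 
atoms $A_i$ of a language $L$ and the states $\mathbf{A}_i$ of the NFA 
$\mathcal{A}$ defined below. We refer to the $\mathbf{A}_i$ as atom symbols. 

Definition 2. The atomaton of $L$ is the NFA 
$\mathcal{A} = (\mathbf{A}, \Sigma, \alpha, \mathbf{A}_I, \{\mathbf{A}_{p-1}\})$, 
where $\mathbf{A} = \{\mathbf{A}_i \mid A_i \in A\}$, 
$\mathbf{A}_I = \{\mathbf{A}_i \mid A_i \in I\}$, $\mathbf{A}_{p-1}$ corresponds 
to $A_{p-1}$, and $\mathbf{A}_j \in \alpha(\mathbf{A}_i, a)$ if and only if 
$aA_j \subseteq A_i$, for all $\mathbf{A}_i, \mathbf{A}_j \in \mathbf{A}$ and 
$a \in \Sigma$. 

In the atomaton, the right language of any state $\mathbf{A}_i$ is the atom 
$A_i$. 

The results from [3] and our definition of atoms in [4] imply that 
$\mathcal{A}^{\mathbb{R}}$ is a minimal DFA that accepts $L^R$. It follows from 
Theorem 1 that $\mathcal{A}^{\mathbb{R}}$ is isomorphic to 
$\mathcal{D}^{\mathbb{RD}}$. The following result from [4] makes this 
isomorphism precise: 

Theorem 2 (Isomorphism). Let $\mathbf{S}$ be the collection of all subsets of 
the set $\mathbf{K}$ of quotient symbols. Let $\varphi : \mathbf{A} \to \mathbf{S}$ 
be the mapping assigning to state $\mathbf{A}_j$, corresponding to 
$A_j = K_{i_0} \cap \cdots \cap K_{i_{n-r-1}} \cap \overline{K_{i_{n-r}}} \cap \cdots \cap \overline{K_{i_{n-1}}}$ 
of $\mathcal{A}^{\mathbb{R}}$, the set 
$\{\mathbf{K}_{i_0}, \ldots, \mathbf{K}_{i_{n-r-1}}\}$. Then $\varphi$ is a DFA 
isomorphism between $\mathcal{A}^{\mathbb{R}}$ and $\mathcal{D}^{\mathbb{RD}}$. 

Corollary 1. The mapping $\varphi$ is an NFA isomorphism between $\mathcal{A}$ 
and $\mathcal{D}^{\mathbb{RDR}}$. 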
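
Theorem 2 and Corollary 1 suggest a direct way to compute atoms from a minimal DFA: each reachable subset of the reversed and determinized DFA is the set of quotients that appear uncomplemented in one atom. The sketch below follows that idea; the DFA encoding (a complete DFA with `delta[(q, a)]` a single state) and the names `reverse_determinize` and `trim_atomaton` are our own assumptions for illustration.

```python
def reverse_determinize(dfa):
    """Reachable subsets of the reversed, determinized DFA.
    dfa = (states, sigma, delta, q0, final) is assumed complete and minimal.
    Each subset is the set of DFA states (quotients) uncomplemented in one
    atom; the empty subset, if reached, stands for the negative atom."""
    states, sigma, delta, q0, final = dfa
    start = frozenset(final)
    seen, todo, pre = {start}, [start], {}
    while todo:
        s = todo.pop()
        for a in sigma:
            # predecessors of s under a = successor of s in the reversed DFA
            t = frozenset(q for q in states if delta[(q, a)] in s)
            pre[(s, a)] = t
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen, pre

def trim_atomaton(dfa):
    """Positive atoms, transition function alpha, initial atoms, final atom."""
    states, sigma, delta, q0, final = dfa
    atoms, pre = reverse_determinize(dfa)
    positive = {s for s in atoms if s}
    alpha = {(s, a): set() for s in positive for a in sigma}
    for (s, a), t in pre.items():
        if s and t:
            alpha[(t, a)].add(s)  # reverse the transition s --a--> t
    initial = {s for s in positive if q0 in s}
    return positive, alpha, initial, frozenset(final)
```

For the quotient DFA of $\Sigma^*ab\Sigma^*$ (states 0, 1, 2 with final state 2) this yields the three subsets {2}, {1,2} and {0,1,2} and no empty subset, so that language has no negative atom.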
3 Atomic NFA's 

A new class of NFA's was defined in [3] as follows: 

Definition 3. An NFA $\mathcal{N} = (Q, \Sigma, \eta, I, F)$ is atomic if for 
every $q \in Q$, the right language $L_{q,F}(\mathcal{N})$ of $q$ is a union of 
some positive atoms of $L(\mathcal{N})$. 

The following theorem, slightly restated, was proved in [3]: 

[Tables 1-3: transition tables of the NFA's $\mathcal{N}_a$, $\mathcal{N}_b$ and 
$\mathcal{N}_c$; table contents not available.] 

Theorem 3 (Atomicity). A trim NFA $\mathcal{N}$ is atomic if and only if 
$\mathcal{N}^{\mathbb{RD}}$ is minimal. 

This theorem allows us to test whether an NFA $\mathcal{N}$ accepting a 
language $L$ is atomic. To do this, reverse $\mathcal{N}$ and apply the subset 
construction. Then $\mathcal{N}$ is atomic if and only if 
$\mathcal{N}^{\mathbb{RD}}$ is isomorphic to the minimal DFA of $L^R$. 

All three possibilities for the atomic nature of $\mathcal{N}$ and 
$\mathcal{N}^{\mathbb{R}}$ exist: NFA $\mathcal{N}_a$ of Table 1 and its reverse 
are not atomic. NFA $\mathcal{N}_b$ of Table 2 is atomic, but its reverse is 
not. NFA $\mathcal{N}_c$ of Table 3 and its reverse are both atomic. Note that 
all three of these NFA's are equivalent, and they accept $\Sigma^*ab\Sigma^*$. 
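
The atomicity test of Theorem 3 can be sketched as follows. The code is an illustration under our own assumptions: the NFA is trim and encoded as `(states, sigma, delta, initial, final)` with `delta[(q, a)]` a set of states, and minimality of the resulting DFA is checked by Moore-style partition refinement (our choice; any DFA minimality check would do). The function names are hypothetical.

```python
def reverse_determinize(nfa):
    """Subset construction on the reversed NFA (reachable subsets only)."""
    states, sigma, delta, initial, final = nfa
    start = frozenset(final)
    seen, todo, trans = {start}, [start], {}
    while todo:
        s = todo.pop()
        for a in sigma:
            # states with an a-transition into s, i.e. a-successors in the reverse
            t = frozenset(q for q in states if delta.get((q, a), set()) & s)
            trans[(s, a)] = t
            if t not in seen:
                seen.add(t)
                todo.append(t)
    accepting = {s for s in seen if s & set(initial)}
    return seen, trans, accepting

def is_minimal(dstates, sigma, trans, accepting):
    """True if no two states of the (reachable, complete) DFA are equivalent."""
    block = {s: int(s in accepting) for s in dstates}
    while True:
        sig = {s: (block[s],) + tuple(block[trans[(s, a)]] for a in sigma)
               for s in dstates}
        ids = {}
        new = {s: ids.setdefault(sig[s], len(ids)) for s in dstates}
        if len(ids) == len(set(block.values())):  # partition is stable
            return len(ids) == len(dstates)
        block = new

def is_atomic(nfa):
    """Theorem 3: a trim NFA is atomic iff its reversed, determinized DFA is minimal."""
    dstates, trans, accepting = reverse_determinize(nfa)
    return is_minimal(dstates, nfa[1], trans, accepting)
```

Applied to two 3-state NFA's for $\Sigma^*ab\Sigma^*$ that appear later in the paper, the test accepts the NFA of Table 16 and rejects the NFA of Table 22, in agreement with the text.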

If we allow equivalent states, there is an infinite number of atomic NFA's, 
but their behaviours are not distinct; hence we consider only reduced NFA's. 
Suppose $\mathcal{B} = (\mathbf{B}, \Sigma, \beta, \mathbf{B}_I, \mathbf{B}_F)$ 
is any trim reduced atomic NFA accepting $L$. Since $\mathcal{B}$ is atomic, the 
right language of any state in $\mathcal{B}$ is a union of positive atoms of 
$L$; hence the states of $\mathcal{B}$ can be represented by sets of positive 
atom symbols. Because $\mathcal{B}$ is trim, it does not have a state with the 
empty set of atom symbols. Since $\mathcal{B}$ is reduced, no set of atom 
symbols appears twice. Thus the state set $\mathbf{B}$ is a collection of 
non-empty sets of positive atom symbols. 

Theorem 4 (Legality). Suppose $L$ is a regular language, its atomaton is 
$\mathcal{A} = (\mathbf{A}, \Sigma, \alpha, \mathbf{A}_I, \{\mathbf{A}_{p-1}\})$, 
and $\mathcal{B} = (\mathbf{B}, \Sigma, \beta, \mathbf{B}_I, \mathbf{B}_F)$ is a 
trim NFA, where $\mathbf{B} = \{B_1, \ldots, B_r\}$ is a collection of sets of 
positive atom symbols and $\mathbf{B}_I, \mathbf{B}_F \subseteq \mathbf{B}$. If 
$\mathbf{B}' \subseteq \mathbf{B}$, define 
$S(\mathbf{B}') = \bigcup_{B_i \in \mathbf{B}'} B_i$ to be the set of atom 
symbols appearing in the sets $B_i$ of $\mathbf{B}'$. Then $\mathcal{B}$ is a 
reduced atomic NFA of $L$ if and only if it satisfies the following conditions: 

1. $S(\mathbf{B}_I) = \mathbf{A}_I$. 

2. For all $B_i \in \mathbf{B}$ and $a \in \Sigma$, 
$S(\beta(B_i, a)) = \alpha(B_i, a)$. 

3. For all $B_i \in \mathbf{B}$, we have $B_i \in \mathbf{B}_F$ if and only if 
$\mathbf{A}_{p-1} \in B_i$. 

Before proving the theorem, we require the following lemma: 

Lemma 1. If $\mathcal{B}$ satisfies Condition 2 of Theorem 4, then 
$S(\beta(B_i, w)) = \alpha(B_i, w)$ for every $B_i \in \mathbf{B}$ and 
$w \in \Sigma^*$. 

Proof. For $w = \varepsilon$, we have 
$S(\beta(B_i, \varepsilon)) = S(\{B_i\}) = B_i$, and 
$\alpha(B_i, \varepsilon) = B_i$, so the claim holds for this case. 

Assume that $S(\beta(B_i, w)) = \alpha(B_i, w)$ for all $B_i \in \mathbf{B}$ and 
all $w \in \Sigma^*$ with length less than or equal to $l \geq 0$. We prove that 
$S(\beta(B_i, wa)) = \alpha(B_i, wa)$ for every $a \in \Sigma$. Let 
$\beta(B_i, w) = \{B_{i_1}, \ldots, B_{i_h}\}$ for some 
$B_{i_1}, \ldots, B_{i_h} \in \mathbf{B}$. Since 
$\beta(B_i, wa) = \beta(\beta(B_i, w), a) = \beta(B_{i_1}, a) \cup \cdots \cup \beta(B_{i_h}, a)$, 
we have 
$S(\beta(B_i, wa)) = S(\beta(B_{i_1}, a) \cup \cdots \cup \beta(B_{i_h}, a)) = S(\beta(B_{i_1}, a)) \cup \cdots \cup S(\beta(B_{i_h}, a))$. 
By Condition 2, the latter is equal to 
$\alpha(B_{i_1}, a) \cup \cdots \cup \alpha(B_{i_h}, a) = \alpha(B_{i_1} \cup \cdots \cup B_{i_h}, a) = \alpha(S(\beta(B_i, w)), a)$. 
By the inductive assumption, we get 
$\alpha(S(\beta(B_i, w)), a) = \alpha(\alpha(B_i, w), a) = \alpha(B_i, wa)$, 
which proves our claim. □ 

Proof of Theorem 4. First we prove that any NFA $\mathcal{B}$ satisfying 
Conditions 1-3 is an atomic NFA of $L$. Let $B_i \in \mathbf{B}$ be a state of 
$\mathcal{B}$. If $w \in L_{B_i,\mathbf{B}_F}(\mathcal{B})$, then by Condition 3, 
there exists $B_j \in \beta(B_i, w)$ such that $\mathbf{A}_{p-1} \in B_j$, and 
we have $\mathbf{A}_{p-1} \in S(\beta(B_i, w))$. By Lemma 1, we get 
$\mathbf{A}_{p-1} \in \alpha(B_i, w)$, implying that there is some 
$\mathbf{A}_k \in B_i$ such that 
$w \in L_{\mathbf{A}_k,\{\mathbf{A}_{p-1}\}}(\mathcal{A})$. Conversely, if 
$w \in L_{\mathbf{A}_k,\{\mathbf{A}_{p-1}\}}(\mathcal{A})$ and 
$\mathbf{A}_k \in B_i$, then 
$\mathbf{A}_{p-1} \in \alpha(B_i, w) = S(\beta(B_i, w))$. Hence there exists 
$B_j \in \beta(B_i, w)$ such that $\mathbf{A}_{p-1} \in B_j$. Consequently, 
every word accepted in $\mathcal{B}$ from state $B_i$ is in some atom $A_k$ such 
that $\mathbf{A}_k \in B_i$, and every word in an atom $A_k$ such that 
$\mathbf{A}_k \in B_i$ is also in $L_{B_i,\mathbf{B}_F}(\mathcal{B})$. Therefore 
the right language of $B_i$ in $\mathcal{B}$ is equal to the union of atoms 
$A_k$ such that $\mathbf{A}_k \in B_i$. In particular, 
$L_{\mathbf{B}_I,\mathbf{B}_F}(\mathcal{B})$ is the union of atoms whose atom 
symbols appear in the initial collection of $\mathcal{B}$ which, by Condition 1, 
is the same as the union of atoms whose atom symbols are initial in 
$\mathcal{A}$. But that last union is precisely 
$L_{\mathbf{A}_I,\{\mathbf{A}_{p-1}\}}(\mathcal{A}) = L$. Since any two sets 
$B_i$ and $B_j$ are different, and atoms are disjoint, $\mathcal{B}$ is reduced. 
Hence $\mathcal{B}$ is a reduced atomic NFA of $L$. 

Conversely, we show that if $\mathcal{B}$ is a reduced atomic NFA of $L$, then 
it must satisfy Conditions 1-3. So in the following we assume that 
$\mathcal{B}$ is atomic, that is, for every state $B_i$ of $\mathcal{B}$, the 
right language of $B_i$ is equal to the union of atoms $A_k$ such that 
$\mathbf{A}_k \in B_i$. 

First, we show that Condition 1 holds. Let $\mathbf{A}_i \in S(\mathbf{B}_I)$. 
Then there is a state $B_j \in \mathbf{B}_I$ such that $\mathbf{A}_i \in B_j$. 
So for any $w \in A_i$, $w \in L(\mathcal{B})$. Since 
$L(\mathcal{B}) = L(\mathcal{A})$, we have $w \in L(\mathcal{A})$ for all 
$w \in A_i$. Thus $\mathbf{A}_i \in \mathbf{A}_I$. Conversely, if 
$\mathbf{A}_i \in \mathbf{A}_I$, then for all $w \in A_i$, 
$w \in L(\mathcal{A}) = L(\mathcal{B})$. Since $\mathcal{B}$ is atomic, there is 
an initial state $B_j$ such that 
$A_i \subseteq L_{B_j,\mathbf{B}_F}(\mathcal{B})$. Hence 
$\mathbf{A}_i \in S(\mathbf{B}_I)$. 

Next, we prove Condition 2. If $\mathbf{A}_j \in S(\beta(B_i, a))$, then 
$L_{B_i,\mathbf{B}_F}(\mathcal{B})$ must contain $aA_j$. So there must exist 
some $\mathbf{A}_k \in B_i$ such that $aA_j \subseteq A_k$. Thus 
$\mathbf{A}_j \in \alpha(B_i, a)$. Conversely, if 
$\mathbf{A}_j \in \alpha(B_i, a)$, then there is an atom symbol 
$\mathbf{A}_k \in B_i$ such that $\mathbf{A}_j \in \alpha(\mathbf{A}_k, a)$, 
implying $aA_j \subseteq A_k$. Since $\mathbf{A}_k \in B_i$, 
$L_{B_i,\mathbf{B}_F}(\mathcal{B})$ must contain $aA_j$. Hence 
$\mathbf{A}_j \in S(\beta(B_i, a))$. 

To show that Condition 3 holds, we first suppose that $B_i \in \mathbf{B}_F$. 
Then $\varepsilon$ is in the right language of $B_i$. Since $\mathcal{B}$ is 
atomic, $\varepsilon$ must be in one of the atoms of $B_i$. However, the only 
atom containing $\varepsilon$ is $A_{p-1}$, so $\mathbf{A}_{p-1} \in B_i$. 
Conversely, if $\mathbf{A}_{p-1} \in B_i$, then $\varepsilon$ is in the right 
language of $B_i$, and $B_i$ is a final state by definition of an NFA. □ 

Example 1. Consider the trim atomaton $\mathcal{A}^{\mathbb{T}}$ of Table 4 and 
the atomic NFA $\mathcal{B}$ of Table 5. Here $\mathbf{B} = \{B_0, B_1, B_2\}$, 
where $B_0 = \{\mathbf{A}_0, \mathbf{A}_1\}$, $B_1 = \{\mathbf{A}_2\}$, and 
$B_2 = \{\mathbf{A}_0, \mathbf{A}_2\}$. The initial collection is 
$\mathbf{B}_I = \{B_0\} = \{\{\mathbf{A}_0, \mathbf{A}_1\}\}$, and the final 
collection is 
$\mathbf{B}_F = \{B_1, B_2\} = \{\{\mathbf{A}_2\}, \{\mathbf{A}_0, \mathbf{A}_2\}\}$. 
One verifies that all the conditions of Theorem 4 hold, and NFA's 
$\mathcal{A}^{\mathbb{T}}$ and $\mathcal{B}$ are equivalent. ■ 

Table 4. Atomaton $\mathcal{A}^{\mathbb{T}}$. 

             a             b 
   → A0      {A0, A1}      {A0, A2} 
   → A1      {A2} 
   ← A2 

Table 5. Atomic NFA $\mathcal{B}$. 

                  a                       b 
   → {A0, A1}     {{A0, A1}, {A2}}        {{A0, A2}} 
   ← {A2} 
   ← {A0, A2}     {{A0, A1}}              {{A0, A2}} 
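
The verification in Example 1 can be mechanized. The following sketch checks Conditions 1-3 of Theorem 4 for a candidate NFA whose states are sets of atom symbols; the encoding and the function names are our own assumptions, and the check presupposes that the candidate is trim with distinct non-empty states, as the theorem requires.

```python
def union(sets):
    """Union of a collection of sets, i.e. the operator S of Theorem 4."""
    out = set()
    for s in sets:
        out |= set(s)
    return frozenset(out)

def theorem4_conditions(sigma, alpha, a_init, a_final, b_states, beta, b_init, b_final):
    """Return the truth values of Conditions 1, 2 and 3 of Theorem 4.
    alpha maps (atom, letter) to a set of atoms (atomaton transitions);
    beta maps (state, letter) to a set of states, each state a frozenset of atoms."""
    def alpha_set(b, a):
        return union(alpha.get((x, a), ()) for x in b)

    c1 = union(b_init) == frozenset(a_init)
    c2 = all(union(beta.get((b, a), ())) == alpha_set(b, a)
             for b in b_states for a in sigma)
    c3 = all((b in b_final) == (a_final in b) for b in b_states)
    return c1, c2, c3

# Data of Example 1 (atom symbols written as strings).
ALPHA = {("A0", "a"): {"A0", "A1"}, ("A0", "b"): {"A0", "A2"}, ("A1", "a"): {"A2"}}
B0, B1, B2 = frozenset({"A0", "A1"}), frozenset({"A2"}), frozenset({"A0", "A2"})
BETA = {(B0, "a"): {B0, B1}, (B0, "b"): {B2}, (B2, "a"): {B0}, (B2, "b"): {B2}}

RESULT = theorem4_conditions("ab", ALPHA, {"A0", "A1"}, "A2",
                             {B0, B1, B2}, BETA, {B0}, {B1, B2})
```

Here `RESULT` holds the three truth values for the NFA of Table 5. Dropping $B_1$ from $\beta(B_0, a)$ would violate Condition 2, since the atom symbol $\mathbf{A}_2$ would no longer be covered.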

The number of trim reduced atomic NFA's can be very large. There can 
be such NFA's with as many as $2^p - 1$ non-empty states, since there are that 
many non-empty sets of positive atoms. However, in a general case, not all sets 
of positive atom symbols can be states of an atomic NFA. The largest reduced 
atomic NFA is characterized in the following theorem. 

Theorem 5 (Maximal atomic NFA). If $\mathbf{B}$ is the collection of all sets 
$B_i$ such that $B_i$ is a non-empty subset of the set of positive atom symbols 
$\{\mathbf{A}_h \mid A_h \subseteq K_j\}$ of any quotient $K_j$ of $L$, then 
there exists a trim reduced atomic NFA of $L$ with state set $\mathbf{B}$. 

Proof. Let $\mathcal{B} = (\mathbf{B}, \Sigma, \beta, \mathbf{B}_I, \mathbf{B}_F)$ 
be an NFA in which the state set $\mathbf{B}$ is the collection of all sets 
$B_i$ such that $B_i$ is a non-empty subset of the set of atom symbols 
$\{\mathbf{A}_h \mid A_h \subseteq K_j\}$ of any quotient $K_j$ of $L$, where 
$j \in \{0, \ldots, n-1\}$, 
$\beta(B_i, a) = \{B_j \mid B_j \subseteq \alpha(B_i, a)\}$ for every 
$B_i \in \mathbf{B}$ and $a \in \Sigma$, $B_i \in \mathbf{B}_I$ if and only if 
$B_i$ is a subset of the set of atom symbols of the initial quotient $K_{in}$, 
and $B_i \in \mathbf{B}_F$ if and only if $\mathbf{A}_{p-1} \in B_i$. We claim 
that $\mathcal{B}$ is a trim reduced atomic NFA of $L$. 

First, we show that $\mathcal{B}$ is trim. Let us consider any state $B_i$ of 
$\mathcal{B}$. Let $K_j$ be a quotient such that $B_i$ is a subset of the set of 
atom symbols of $K_j$, and let $B_j$ be the set of atom symbols corresponding to 
$K_j$. Let $B_0$ be the set of atom symbols corresponding to the initial 
quotient $K_{in}$ of $L$. Note that $B_0 = \mathbf{A}_I$. Since every set of 
atom symbols corresponding to some quotient is reachable from the initial set 
of atom symbols in the atomaton $\mathcal{A}$, there must be a word 
$w \in \Sigma^*$ such that $B_j$ is reachable from $B_0$ by $w$ in 
$\mathcal{A}$. We show that $B_i$ is reachable from some initial state of 
$\mathcal{B}$ by $w$. If $w = \varepsilon$, then $K_j = K_{in}$, and since 
$B_i \subseteq B_j$, it follows that $B_i$ is an initial state of $\mathcal{B}$ 
reachable from itself by $\varepsilon$. If $w = ua$ for some $u \in \Sigma^*$ 
and $a \in \Sigma$, then there is a state $B_u$ of $\mathcal{B}$, reachable from 
$B_0$ by $u$, such that $B_u$ corresponds to the quotient $u^{-1}L$ of $L$ and 
$B_j = \alpha(B_u, a)$. Since $B_i \subseteq B_j$ and $B_j = \alpha(B_u, a)$, by 
the definition of $\beta$ we have $B_i \in \beta(B_u, a)$. Thus, $B_i$ is 
reachable from $B_0$ in $\mathcal{B}$ by $ua$. 

We also have to show that there is a word $w \in \Sigma^*$ such that some final 
state of $\mathcal{B}$ is reachable from $B_i$ by $w$. If $B_i$ is final, then 
it is reachable from itself by $w = \varepsilon$. If $B_i$ is not final, then 
let us consider any $\mathbf{A}_k \in B_i$. Since the right language of the 
state $\mathbf{A}_k$ in the atomaton $\mathcal{A}$ is not empty, and 
$\mathbf{A}_k$ cannot be the final state of $\mathcal{A}$, there must be some 
state $\mathbf{A}_l$ of $\mathcal{A}$ and some $a \in \Sigma$ such that 
$\mathbf{A}_l \in \alpha(\mathbf{A}_k, a)$. Now we know that there is some $B_j$ 
such that $\mathbf{A}_l \in B_j$ and $\alpha(B_i, a) = B_j$. Since 
$\beta(B_i, a)$ is the collection of all non-empty subsets of $B_j$, it follows 
that $\{\mathbf{A}_l\} \in \beta(B_i, a)$. Since the final state 
$\mathbf{A}_{p-1}$ of $\mathcal{A}$ is reachable from $\mathbf{A}_l$ by any word 
$v \in A_l$, we get $\{\mathbf{A}_{p-1}\} \in \beta(B_i, av)$ by the definition 
of $\beta$. So a final state $\{\mathbf{A}_{p-1}\}$ of $\mathcal{B}$ is 
reachable from $B_i$ by $av$. Thus, $\mathcal{B}$ is trim. 

To see that $\mathcal{B}$ is a reduced atomic NFA, one verifies that Conditions 
1-3 of Theorem 4 hold. Thus by Theorem 4, $\mathcal{B}$ is a trim reduced atomic 
NFA of $L$. □ 
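
The construction used in this proof is easy to carry out mechanically. The sketch below builds the NFA of Theorem 5 from the atomaton transitions and the atom sets of the quotients; the encoding and the name `maximal_atomic_nfa` are our own assumptions. The sample data are those of the language $\Sigma^*(b \cup aa) \cup a$ studied in Example 2, with positive atoms written as the letters A, B, C.

```python
from itertools import combinations

def nonempty_subsets(items):
    """All non-empty subsets of a collection, as frozensets."""
    items = sorted(items)
    return {frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)}

def maximal_atomic_nfa(sigma, alpha, quotient_atoms, initial_quotient, final_atom):
    """States: all non-empty subsets of the atom set of some quotient.
    beta(B_i, a): all states contained in alpha(B_i, a)."""
    states = set()
    for atoms in quotient_atoms.values():
        states |= nonempty_subsets(atoms)

    def alpha_set(b, a):
        out = set()
        for x in b:
            out |= set(alpha.get((x, a), ()))
        return out

    beta = {(b, a): {c for c in states if c <= alpha_set(b, a)}
            for b in states for a in sigma}
    initial = {b for b in states if b <= set(quotient_atoms[initial_quotient])}
    final = {b for b in states if final_atom in b}
    return states, beta, initial, final

# Atomaton transitions and quotients of Example 2 (strings stand for sets of atoms).
ALPHA = {("A", "a"): "AB", ("A", "b"): "AC", ("B", "a"): "C"}
QUOTIENTS = {0: "AB", 1: "ABC", 2: "AC"}
STATES, BETA, INITIAL, FINAL = maximal_atomic_nfa("ab", ALPHA, QUOTIENTS, 0, "C")
```

Since the quotient $K_1$ contains all three positive atoms, every non-empty set of atoms becomes a state, and the resulting NFA has $2^3 - 1 = 7$ states, three of which are initial.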

Theorem 6 (NFA with $2^p - 1$ states). A regular language $L$ has a trim 
reduced atomic NFA with $2^p - 1$ states if and only if for some quotient $K_i$ 
of $L$, $K_i = A_0 \cup \cdots \cup A_{p-1}$. 

Proof. Let $\mathcal{B} = (\mathbf{B}, \Sigma, \beta, \mathbf{B}_I, \mathbf{B}_F)$ 
be a trim reduced atomic NFA of $L$ with $2^p - 1$ states. Then there must be a 
state $B_i$ of $\mathcal{B}$ such that 
$B_i = \{\mathbf{A}_0, \ldots, \mathbf{A}_{p-1}\}$. Since the right language of 
any state of a trim NFA is a subset of some quotient, we have 
$L_{B_i,\mathbf{B}_F}(\mathcal{B}) = A_0 \cup \cdots \cup A_{p-1} \subseteq K_i$ 
for some quotient $K_i$ of $L$. On the other hand, $K_i$ must be a union of some 
positive atoms, so we get $K_i = A_0 \cup \cdots \cup A_{p-1}$. 

Conversely, let $K_i = A_0 \cup \cdots \cup A_{p-1}$ be a quotient of $L$ which 
includes all the positive atoms of $L$. Then by Theorem 5, there is a trim 
reduced atomic NFA of $L$ in which the state set is the collection of all 
non-empty subsets of the set of positive atom symbols. This NFA has $2^p - 1$ 
states. □ 

The construction of reduced atomic NFA's is illustrated in the following 
example. To simplify the notation, we do not use atom symbols in examples. 

Example 2. Consider the minimal DFA $\mathcal{D}$ taken from [6] and shown in 
Table 6. It accepts the language $L = \Sigma^*(b \cup aa) \cup a$, and its 
quotients are $K_0 = \varepsilon^{-1}L = L$, 
$K_1 = a^{-1}L = \Sigma^*(b \cup aa) \cup a \cup \varepsilon$, and 
$K_2 = b^{-1}L = \Sigma^*(b \cup aa) \cup \varepsilon$. NFA 
$\mathcal{D}^{\mathbb{RDRT}}$ and the isomorphic trim atomaton 
$\mathcal{A}^{\mathbb{T}}$ with states renamed are shown in Tables 7 and 8. The 
positive atoms are $A = \Sigma^*(b \cup aa)$, $B = a$ and $C = \varepsilon$, and 
$K_0 = A \cup B$, $K_1 = A \cup B \cup C$, and $K_2 = A \cup C$. 

Table 6. $\mathcal{D}$. 

           a     b 
   → 0     1     2 
   ← 1     1     2 
   ← 2     0     2 

Table 7. $\mathcal{D}^{\mathbb{RDRT}}$. 

             a             b 
   ← 12 
   → 01      {12} 
   → 012     {012, 01}     {012, 12} 

Table 8. $\mathcal{A}^{\mathbb{T}}$. 

           a          b 
   ← C 
   → B     {C} 
   → A     {A,B}      {A,C} 

Since the set {A, B} of initial atoms does not contain all positive atoms, no 
1-state NFA exists. 

1. For the initial state we could pick one state {A, B} with two atoms. From 
there, the atomaton reaches {A, B, C} under a, and {A, C} under b. 

(a) If we pick {A, C} as the second state, we can cover {A, B, C} by {A, B} 
and {A, C}, as in Table 9. Here the minimal atomic NFA is unique. 

(b) We can also use {A, B, C} as a state. Then we need {A, C} for the 
transition under b. This gives an NFA isomorphic to the DFA of Table 6. 

(c) We can use state {C} as shown in Table 10. 

2. We can pick two initial states, {A} and {B}. 

(a) If we add {C}, this leads to the atomaton of Table 8. 

(b) A 5-state solution is shown in Table 11. 

3. We can use three initial states, {A}, {B} and {A, B}. A 7-state NFA is 
shown in Table 12. This is a largest possible reduced solution. ■ 

Table 9. NFA $\mathcal{B}_1$. 

                 a                   b 
   → {A,B}       {A,B}, {A,C}        {A,C} 
   ← {A,C}       {A,B}               {A,C} 

Table 10. Atomic NFA $\mathcal{B}_2$. 

                 a                   b 
   → {A,B}       {A,B}, {C}          {A,C} 
   ← {C} 
   ← {A,C}       {A,B}               {A,C} 

Table 11. A 5-state NFA. 

                 a                   b 
   → {A}         {A}, {B}            {A,C} 
   → {B}         {C} 
   ← {A,C}       {A,B}               {A,C} 
   ← {C} 
     {A,B}       {A,B}, {C}          {A}, {C} 

Table 12. A 7-state NFA. 

                 a                   b 
   → {A}         {A}, {B}            {A,C} 
   → {B}         {C} 
   ← {A,C}       {A,B}               {A,C} 
   ← {C} 
   → {A,B}       {A,B,C}, {B,C}      {A,C} 
   ← {A,B,C}     {A,B,C}, {B,C}      {A,C} 
   ← {B,C}       {C} 

The number of minimal atomic NFA's can also be very large. 

Example 3. Let $\Sigma = \{a, b\}$ and consider the language 
$L = \Sigma^*a\Sigma^*b\Sigma^* = \Sigma^*ab\Sigma^*$. The quotients of $L$ are 
$K_0 = L$, $K_1 = L \cup b\Sigma^*$ and $K_2 = \Sigma^*$. The quotient DFA of 
$L$ is shown in Table 13, and its atomaton, in Tables 14 and 15 (where the atoms 
have been relabelled). The atoms are $A = L$, $B = b^*ba^*$ and $C = a^*$, and 
there is no negative atom. Thus the quotients are $K_0 = L = A$, 
$K_1 = A \cup B$, and $K_2 = A \cup B \cup C$. 

We find all the minimal atomic NFA's of $L$. Obviously, there is no 1-state 
solution. The states of any atomic NFA are sets of atoms, and there are seven 
non-empty sets of atoms to choose from. Since there is only one initial atom, 
there is no choice: we must take {A}. For the transition (A, a, {A, B}), we can 
add {B} or {A, B}. If there are only two states, atom C cannot be reached. 
So there is no 2-state atomic NFA. The results for 3-state atomic NFA's are 
summarized in Proposition 1. 

Proposition 1. The language $\Sigma^*ab\Sigma^*$ has 281 minimal atomic NFA's. 

Table 13. DFA $\mathcal{D}$. 

           a     b 
   → 0     1     0 
     1     1     2 
   ← 2     2     2 

Table 14. Atomaton $\mathcal{A}$. 

             a              b 
   ← 2       {2} 
     12                     {12, 2} 
   → 012     {012, 12}      {012} 

Table 15. $\mathcal{A}$ relabelled. 

           a          b 
   ← C     {C} 
     B                {B,C} 
   → A     {A,B}      {A} 

Table 16. NFA $\mathcal{N}_2$. 

            a          b 
   → A      AB         A 
     AB     AB         AB, C 
   ← C      C 

Table 17. NFA $\mathcal{N}_9$. 

            a          b 
   → A      A, AB      A 
     AB     A, AB      A, AB, C 
   ← C      C 


Proof. We concentrate on 3-state solutions. We drop the curly brackets and 
commas and represent sets of atoms by words. Thus {A, AB, BC} stands for 
{{A},{A,B},{B,C}}. 

State A is the only initial state and so it must be included. To implement 
the transition (A, a, {A, B}) from $\mathcal{A}$, either B or AB must be chosen. 

1. If B is chosen, then there must be a set containing C but not A; otherwise 
the transition (B, b, {B, C}) cannot be realized. 

(a) If BC is taken, then C must be taken, and this would make four states. 

(b) Hence C must be chosen, giving states A, B, and C. This yields the 
atomaton $\mathcal{A} = \mathcal{N}_1$. 

2. If AB is chosen, then we could choose C, AC or ABC, since BC would also 
require C. Thus there are three cases: 

(a) {A, AB, C} yields $\mathcal{N}_2$ of Table 16, if the minimal number of 
transitions is used. The following transitions can also be added: (A, a, A), 
(AB, a, A), (AB, b, A). Since these can be added independently, we have eight 
more NFA's. Using the maximal number of transitions, we get $\mathcal{N}_9$ of 
Table 17. 

(b) {A, AB, AC} results in $\mathcal{N}_{10}$ with the minimal number of 
transitions, and $\mathcal{N}_{25}$ with the maximal one. 

(c) {A, AB, ABC} results in $\mathcal{N}_{26}$ (the quotient DFA) with the 
minimal number of transitions, and $\mathcal{N}_{281}$ with the maximal one. 
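
The case analysis can be cross-checked by brute force. The sketch below enumerates every choice of $k$ states among the seven non-empty sets of atoms and every assignment of initial collections and transitions, and counts those that satisfy Conditions 1-3 of Theorem 4 and are trim. It is our own illustration, not the authors' procedure; the atomaton of Table 15 is hard-coded and all names are hypothetical.

```python
from itertools import combinations, product

SIGMA = "ab"
# Atomaton of Table 15: ALPHA[(atom, letter)] is the set of successor atoms.
ALPHA = {("A", "a"): "AB", ("A", "b"): "A", ("B", "a"): "", ("B", "b"): "BC",
         ("C", "a"): "C", ("C", "b"): ""}
INITIAL_ATOMS = frozenset("A")
FINAL_ATOM = "C"

def subcollections(states):
    return [frozenset(c) for r in range(len(states) + 1)
            for c in combinations(states, r)]

def covers(collection, target):
    """True if the union of the sets in `collection` is exactly `target`."""
    return frozenset().union(*collection) == target

def alpha_set(b, a):
    return frozenset().union(*(frozenset(ALPHA[(x, a)]) for x in b))

def is_trim(states, beta, initial, final):
    def reach(seeds, succ):
        seen, todo = set(seeds), list(seeds)
        while todo:
            q = todo.pop()
            for t in succ(q):
                if t not in seen:
                    seen.add(t)
                    todo.append(t)
        return seen
    fwd = lambda q: set().union(*(beta[(q, a)] for a in SIGMA))
    bwd = lambda q: {p for p in states for a in SIGMA if q in beta[(p, a)]}
    return reach(initial, fwd) == set(states) == reach(final, bwd)

def count_atomic_nfas(k):
    """Number of trim reduced atomic NFA's with exactly k states."""
    all_sets = [frozenset(c) for r in (1, 2, 3) for c in combinations("ABC", r)]
    total = 0
    for states in combinations(all_sets, k):
        inits = [c for c in subcollections(states) if covers(c, INITIAL_ATOMS)]
        keys = [(b, a) for b in states for a in SIGMA]
        options = [[c for c in subcollections(states) if covers(c, alpha_set(b, a))]
                   for (b, a) in keys]                         # Condition 2
        final = {b for b in states if FINAL_ATOM in b}         # Condition 3
        for init in inits:                                     # Condition 1
            for choice in product(*options):
                if is_trim(states, dict(zip(keys, choice)), init, final):
                    total += 1
    return total
```

Under these assumptions the count for $k = 3$ splits as $1 + 8 + 16 + 256$ over the state sets {A, B, C}, {A, AB, C}, {A, AB, AC} and {A, AB, ABC}, matching the numbering $\mathcal{N}_1$ to $\mathcal{N}_{281}$ used in the proof.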



Table 18. NFA $\mathcal{N}_{10}$. 

            a           b 
   → A      AB          A 
     AB     AB          AB, AC 
   ← AC     AB, AC      A 

Table 19. NFA $\mathcal{N}_{25}$. 

            a              b 
   → A      A, AB          A 
     AB     A, AB          A, AB, AC 
   ← AC     A, AB, AC      A 

Table 20. NFA $\mathcal{N}_{26}$. 

             a       b 
   → A       AB      A 
     AB      AB      ABC 
   ← ABC     ABC     ABC 

Table 21. NFA $\mathcal{N}_{281}$. 

             a               b 
   → A       A, AB           A 
     AB      A, AB           A, AB, ABC 
   ← ABC     A, AB, ABC      A, AB, ABC 

Table 22. NFA $\mathcal{N}_{282}$. 

           a        b 
   → 0     1        0 
     1     1        0, 1, 2 
   ← 2     0, 2 






As well, $L$ has 3-state non-atomic NFA's. The determinized version of NFA 
$\mathcal{N}_{10}$ of Table 18 is not minimal. By Theorem 3, 
$\mathcal{N}_{10}^{\mathbb{R}}$ is not atomic. But $L^R = \Sigma^*ba\Sigma^*$; 
hence we obtain a non-atomic 3-state NFA for $L$ by reversing $\mathcal{N}_{10}$ 
and interchanging $a$ and $b$. That NFA with renamed states is shown in 
Table 22. 

The right languages of the states of $\mathcal{N}_{282}$ are: $L_0 = L = A$, 
$L_1 = A \cup B$, and 
$L_2 = \varepsilon \cup a \cup aa\Sigma^* \cup abb^*aa^*b\Sigma^*$, which is not 
a union of atoms. Six more non-atomic NFA's can be derived from NFA's between 
$\mathcal{N}_{10}$ and $\mathcal{N}_{25}$. □ 

This is a rather large number of NFA's for a language with 3 quotients. ■ 

One can verify that there is no NFA with fewer than 3 states which accepts 
the language $L = \Sigma^*ab\Sigma^*$. This implies that every minimal atomic 
NFA of $L$ is also a minimal NFA of $L$. However, this is not the case with all 
regular languages, as we will see in the next section. 



4 Sengoku's NFA Minimization Method 

Sengoku had no concept of atom, but he came very close to discovering it. 
For a language accepted by a minimal DFA $\mathcal{D}$, the normal NFA [11] 
(p. 18) is isomorphic to $\mathcal{D}^{\mathbb{RDRT}}$ and hence to the trim 
atomaton, by our Corollary 1. Moreover, he defines an NFA $\mathcal{N}$ to be in 
standard form [11] (p. 19) if $\mathcal{N}^{\mathbb{RD}}$ is minimal. By our 
Theorem 3, such an $\mathcal{N}$ is atomic. Sengoku makes the following 
claim [11] (p. 20): 

We can transform the nondeterministic automaton into its standard form 
by adding some extra transitions to the automaton. Therefore the number 
of states is unchangeable. 

This claim amounts to stating that any NFA can be transformed to an equivalent 
atomic NFA by adding some transitions. Unfortunately, the claim is false: 

Theorem 7. There exists a language for which no minimal NFA is atomic. 

Proof. This example is from [7]. A quotient DFA $\mathcal{D}$, the NFA 
$\mathcal{D}^{\mathbb{RDR}}$, and its isomorphic atomaton $\mathcal{A}$ with 
relabelled states are in Tables 23-25, respectively (there is no negative 
atom). We now drop the curly brackets and commas in tables, and represent sets 
of atoms by words. A minimal NFA $\mathcal{N}_{min}$ of this language, having 
four states, is shown in Table 26; it is not atomic and it is not unique. We 
try to construct a 4-state atomic NFA $\mathcal{N}_{atom}$ equivalent to 
$\mathcal{D}$. 

Table 23. $\mathcal{D}$. 

           a     b 
   → 0     1     2 
     1     3     4 
   ← 2     5     4 
     3     3     1 
     4     6     2 
   ← 5     7     2 
     6     3     8 
   ← 7     7     7 
     8     6     7 

Table 24. $\mathcal{D}^{\mathbb{RDR}}$. 

               a               b 
   ← 257       257, 04578 
   → 04578     12678           257 
     12678                     04578, 03-8 
   → 03-8                      12678 
     1-8       03-8 
   → 0-8       1-8, 0-8        1-8, 0-8 

Table 25. $\mathcal{A}$. 

           a      b 
   ← A     AB 
   → B     C      A 
     C            BD 
   → D            C 
     E     D 
   → F     EF     EF 


First, we note that quotients corresponding to the states of $\mathcal{D}$ can 
be expressed as sets of atoms as follows: $K_0 = \{B,D,F\}$, 
$K_1 = \{C,E,F\}$, $K_2 = \{A,C,E,F\}$, $K_3 = \{D,E,F\}$, 
$K_4 = \{B,D,E,F\}$, $K_5 = \{A,B,D,E,F\}$, $K_6 = \{C,D,E,F\}$, 
$K_7 = \{A,B,C,D,E,F\}$, and $K_8 = \{B,C,D,E,F\}$. One can verify that these 
are the states of the determinized version of the atomaton, which is isomorphic 
to the original DFA $\mathcal{D}$. Now, every state of $\mathcal{N}_{atom}$ must 
be a subset of a set of atoms of some quotient, and all these sets of atoms of 
quotients must be covered by the states of $\mathcal{N}_{atom}$. We note that 
quotients {B, D, F}, {C, E, F}, and {D, E, F} do not contain any other 
quotients as subsets, while all the other quotients do. It is easy to see that 
there is no combination of three or fewer sets of atoms, other than these three 
sets, that can cover these quotients. So we have to use these sets as states of 
$\mathcal{N}_{atom}$. We also need at least one set containing the atom A. If we 
use only one set of atoms with A, that set has to be a subset of every quotient 
having A. So it must be a subset of {A, E, F}. If we use {A} as a state, then 
by the transition table of the atomaton, there must be at least one more state 
to cover {A, B}. Similarly, if we use {A, E}, then we must have another state 
to cover {A, B, D}. If we use {A, F}, then we must have a state to cover 
{A, B, E, F}. And if we use {A, E, F}, then we must have a state to cover 
{E, F}. We conclude that a smallest atomic NFA has at least five states. There 
is a five-state atomic NFA, as shown in Table 27. It is not unique. 

Since there does not exist a four-state atomic NFA equivalent to the DFA 
$\mathcal{D}$, it is not possible to convert the non-atomic minimal NFA 
$\mathcal{N}_{min}$ to an atomic NFA by adding transitions. □ 
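
The sets of atoms used in this proof can be recomputed from the DFA of Table 23. The sketch below is our own illustration: it obtains the atoms as the reachable subsets of the reversed and determinized DFA, names them A to F as in Tables 24 and 25 (the `LABELS` dictionary is transcribed from Table 24), and expresses each quotient as a set of atom names.

```python
# DFA of Table 23: DELTA[q] = (successor under a, successor under b).
DELTA = {0: (1, 2), 1: (3, 4), 2: (5, 4), 3: (3, 1), 4: (6, 2),
         5: (7, 2), 6: (3, 8), 7: (7, 7), 8: (6, 7)}
FINAL = {2, 5, 7}

def atom_sets(delta, final):
    """Reachable subsets of the reversed, determinized DFA.
    Each subset lists the states whose quotients contain the atom."""
    start = frozenset(final)
    seen, todo = {start}, [start]
    while todo:
        s = todo.pop()
        for letter in (0, 1):  # index 0 is a, index 1 is b
            t = frozenset(q for q in delta if delta[q][letter] in s)
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen

# Atom names as in Tables 24 and 25.
LABELS = {frozenset({2, 5, 7}): "A", frozenset({0, 4, 5, 7, 8}): "B",
          frozenset({1, 2, 6, 7, 8}): "C", frozenset({0, 3, 4, 5, 6, 7, 8}): "D",
          frozenset(range(1, 9)): "E", frozenset(range(9)): "F"}

ATOMS = atom_sets(DELTA, FINAL)
# Quotient K_q as the set of names of the atoms it contains.
QUOTIENT = {q: {LABELS[s] for s in ATOMS if q in s} for q in DELTA}
# Quotients whose atom set contains no other quotient's atom set.
MINIMAL = {q for q in DELTA
           if not any(QUOTIENT[r] < QUOTIENT[q] for r in DELTA)}
# Atoms common to all quotients that contain the atom A.
COMMON_WITH_A = set.intersection(*(QUOTIENT[q] for q in DELTA if "A" in QUOTIENT[q]))
```

With this encoding, `MINIMAL` collects the indices of the quotients {B, D, F}, {C, E, F} and {D, E, F}, and `COMMON_WITH_A` is the set {A, E, F} that bounds any single state containing A.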

In summary, Sengoku's method cannot find the minimal NFA's in all cases. 
However, it is able to find all atomic minimal NFA's. His minimization algorithm 
proceeds by "merging some states of the normal nondeterministic automaton." 
This is similar to our search for subsets of atoms that satisfy Theorem 4. 

Table 26. NFA $\mathcal{N}_{min}$. 

           a           b 
   → 0     1           1, 2 
     1     3           0, 3 
   ← 2     0, 2, 3 
     3     3           1 

Table 27. $\mathcal{N}_{atom}$. 

             a                  b 
   → BDF     CEF                CEF, AEF 
     CEF     DEF                BDF, DEF 
   ← AEF     BDF, AEF, DEF      EF 
     DEF     DEF                CEF 
     EF      DEF                EF 



5 The Kameda-Weiner Minimization Method 

We present a short and modified outline of the properties of the Kameda-Weiner
NFA minimization method [6] using mostly our terminology and notation. They
consider a trim minimal DFA $\mathcal{D} = (Q, \Sigma, \delta, q_0, F)$ with $Q$
of cardinality $n$, and its reversed determinized and trim version
$\mathcal{D}^{RDT}$; the set of states of $\mathcal{D}^{RDT}$ is a subset $S$ of
cardinality $p$ of $2^Q \setminus \{\emptyset\}$. They then form an $n \times p$
matrix $T$ where the rows correspond to non-empty states $q_i \in Q$ of
$\mathcal{D}$, which is the trim minimal DFA of a language $L$, and columns, to
states $S_j \in S$ of $\mathcal{D}^{RDT}$, which is the trim minimal DFA of the
language $L^R$ by Theorem 1. The entry $t_{i,j}$ of the matrix $T$ is 1 if
$q_i \in S_j$, and 0 otherwise.

We use $\mathcal{D}^{RDRT}$, the trim atomaton, instead of $\mathcal{D}^{RDT}$,
since the state sets of these two automata are identical. Interpret the rows of
the matrix as non-empty quotients of $L$ and columns, as positive atoms of $L$.
Then $t_{i,j} = 1$ if and only if quotient $K_i$ contains atom $A_j$, and it is
clear that every regular language defines a unique such matrix, which we will
refer to as the quotient-atom matrix.
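
As an illustration, the quotient-atom matrix of the example of Theorem 7 can be written down directly from the atom sets $K_0, \dots, K_8$ given in that proof. The sketch below assumes those sets and orders the columns as $A, \dots, F$; the function name `quotient_atom_matrix` is hypothetical.

```python
ATOMS = "ABCDEF"

# Rows K_0..K_8 of the example, each given by the atoms it contains.
QUOTIENTS = ["BDF", "CEF", "ACEF", "DEF", "BDEF",
             "ABDEF", "CDEF", "ABCDEF", "BCDEF"]


def quotient_atom_matrix(quotients, atoms):
    """Entry (i, j) is 1 exactly when quotient i contains atom j."""
    return [[1 if atom in quotient else 0 for atom in atoms]
            for quotient in quotients]


T = quotient_atom_matrix(QUOTIENTS, ATOMS)
for i, row in enumerate(T):
    print(f"K_{i}", row)
```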

The ordered pair $(K_i, A_j)$ with $K_i \in K$ and $A_j \in A$ is a point of $T$
if $t_{i,j} = 1$. A grid $g$ of $T$ is the direct product $g = P \times R$ of a
set $P$ of quotients with a set $R$ of atoms, such that every pair in $g$ is a
point of $T$. If $g = P \times R$ and $g' = P' \times R'$ are two grids of $T$,
then $g \subseteq g'$ if and only if $P \subseteq P'$ and $R \subseteq R'$. Thus
$\subseteq$ is a partial order on the set of all grids of $T$, and a grid is
maximal if it is not contained in any other grid. A cover $C$ of $T$ is a set
$C = \{g_0, \dots, g_{k-1}\}$ of grids, such that every point $(K_i, A_j)$
belongs to some grid in $C$. A minimal cover has the minimal number of grids.
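
To make the notion of a maximal grid concrete, here is a small sketch that enumerates the maximal grids of the example matrix by brute force. It assumes the atom sets of $K_0, \dots, K_8$ from the proof of Theorem 7 and that both $P$ and $R$ are non-empty; the closure-based enumeration is our own choice for illustration, not the procedure of [6].

```python
from itertools import combinations

# K_0..K_8 as sets of atoms (rows of the quotient-atom matrix).
QUOTIENTS = [frozenset(q) for q in
             ("BDF", "CEF", "ACEF", "DEF", "BDEF",
              "ABDEF", "CDEF", "ABCDEF", "BCDEF")]


def maximal_grids(quotients):
    """All maximal grids P x R with P and R non-empty.

    For a set P of rows, R is the set of atoms common to them; closing P to
    every row containing R gives a grid that cannot be extended either way.
    """
    rows = range(len(quotients))
    grids = set()
    for size in range(1, len(quotients) + 1):
        for P in combinations(rows, size):
            R = frozenset.intersection(*(quotients[i] for i in P))
            if R:
                closed_P = frozenset(i for i in rows if R <= quotients[i])
                grids.add((closed_P, R))
    return grids


GRIDS = maximal_grids(QUOTIENTS)
for P, R in sorted(GRIDS, key=lambda g: (sorted(g[0]), sorted(g[1]))):
    print(sorted(P), "x", "".join(sorted(R)))
```

The exhaustive loop over subsets of rows is exponential in $n$ and is only meant for matrices of this size.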

Let $f : K \to 2^C \setminus \{\emptyset\}$ be the function that assigns to
quotient $K_i \in K$ the set of grids $g = P \times R$ such that $K_i \in P$. The
NFA constructed by the Kameda-Weiner method is
$\mathcal{N}_C = (C, \Sigma, \eta_C, C_I, C_F)$, where $C$ is a cover consisting
of maximal grids, $C_I = f(K_{in})$ is the set of grids corresponding to the
initial quotient $K_{in}$, and $C_F$ is defined by $g \in C_F$ if and only if
$g \in f(K_i)$ implies that $K_i$ is a final quotient. For every grid
$g = P \times R$ and $x \in \Sigma$, we can compute $\eta_C(g, x)$ by the formula
$\eta_C(g, x) = \bigcap_{K_i \in P} f(x^{-1} K_i)$.

It may be the case that $\mathcal{N}_C$ is not equivalent to DFA $\mathcal{D}$.
A cover $C$ is called legal if $L(\mathcal{N}_C) = L(\mathcal{D})$. To find a
minimal NFA of a language $L$, the method in [6] tests the covers of the
quotient-atom matrix of $L$ in the order of increasing size to see if they are
legal. The first legal NFA is a minimal one.

We apply the Kameda-Weiner method [6] to the example in Theorem 7 and obtain
the NFA of Table 26. The quotients in the example are referred to as the
integers 0-8, as in Table 23. The atoms are those in Table 24 relabelled as in
Table 25. The quotient-atom matrix is shown in Table 28, where the non-blank
entries are to be interpreted as 1's and the blank entries as 0's. Table 28 also
shows a minimal cover $C = \{g_0, g_1, g_2, g_3\}$ and $f(K_i)$ for each
quotient $K_i$ of $K$.



Table 28. Cover $C$ for the quotient-atom matrix of $\mathcal{D}$.

              F              E           D        C     B     A     f(K_i)
    →   0     g0                         g0             g0          {g0}
        1     g1             g1                   g1                {g1}
    ←   2     g1,g2          g1,g2                g1          g2    {g1,g2}
        3     g3             g3          g3                         {g3}
        4     g0,g3          g3          g0,g3          g0          {g0,g3}
    ←   5     g0,g2,g3       g2,g3       g0,g3          g0    g2    {g0,g2,g3}
        6     g1,g3          g1,g3       g3       g1                {g1,g3}
    ←   7     g0,g1,g2,g3    g1,g2,g3    g0,g3    g1    g0    g2    {g0,g1,g2,g3}
        8     g0,g1,g3       g1,g3       g0,g3    g1    g0          {g0,g1,g3}
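
The entries of Table 28 can be cross-checked with a short script. This is only a sketch, assuming the atom sets of $K_0, \dots, K_8$ from the proof of Theorem 7 and the four grids as we read them from the table; the names `COVER`, `cells` and `f` are ours. It verifies that each grid consists of points only, that the grids together cover every point, and it recomputes $f(K_i)$.

```python
from itertools import product

# K_0..K_8 as sets of atoms.
QUOTIENTS = ["BDF", "CEF", "ACEF", "DEF", "BDEF",
             "ABDEF", "CDEF", "ABCDEF", "BCDEF"]

# The cover of Table 28: grid name -> (quotient indices P, atoms R).
COVER = {
    "g0": ({0, 4, 5, 7, 8}, "BDF"),
    "g1": ({1, 2, 6, 7, 8}, "CEF"),
    "g2": ({2, 5, 7}, "AEF"),
    "g3": ({3, 4, 5, 6, 7, 8}, "DEF"),
}

# The points of T: pairs (i, atom) with atom in K_i.
points = {(i, atom) for i, q in enumerate(QUOTIENTS) for atom in q}


def cells(P, R):
    """All pairs of the direct product P x R."""
    return set(product(P, R))


all_grids_valid = all(cells(P, R) <= points for P, R in COVER.values())
is_cover = set().union(*(cells(P, R) for P, R in COVER.values())) == points


def f(i):
    """Names of the grids whose quotient set P contains K_i."""
    return {name for name, (P, _) in COVER.items() if i in P}


print(all_grids_valid, is_cover)
for i in range(len(QUOTIENTS)):
    print(i, sorted(f(i)))
```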



The construction of the NFA $\mathcal{N}_{min}$ is shown in Table 29. For each
grid $g = P \times R$, we show its set of quotients $P$, with $K_i \in P$
replaced by $i$. For each input $x \in \Sigma$, we give $x^{-1}P$, and then the
intersection of the $f(K_i)$ for $K_i \in x^{-1}P$. For example, the set $P$ for
$g_0$ is expressed as $\{0,4,5,7,8\}$, the set of quotients $a^{-1}P$ of the set
$P$ by $a$ is $\{1,6,7\}$, and
$\eta_C(g_0, a) = f(1) \cap f(6) \cap f(7) = \{g_1\} \cap \{g_1,g_3\} \cap \{g_0,g_1,g_2,g_3\} = \{g_1\}$.
Table 26 shows the constructed NFA $\mathcal{N}_{min}$, where $g_i$'s are
replaced by $i$'s. Since $\mathcal{N}_{min}$ is equivalent to $\mathcal{D}$, $C$
is a legal cover. However, $\mathcal{N}_{min}$ is not atomic, since the right
language of state $g_2$ is not a union of atoms, although it includes atoms $A$
and $E$ as its subsets. The right languages of the other states of
$\mathcal{N}_{min}$ are unions of atoms: $L(g_0) = B \cup D \cup F$,
$L(g_1) = C \cup E \cup F$, and $L(g_3) = D \cup E \cup F$.
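
The transition function of Table 26 can be recomputed from the formula for $\eta_C$. The sketch below makes two assumptions that should be read as ours: the quotient transitions $x^{-1}K_i$ are obtained by moving the atoms of $K_i$ in the atomaton as we read it from Table 25, and grids are named by their indices 0-3. It also checks that the grids in $f(K_i)$ jointly move on $x$ to $f(x^{-1}K_i)$, which agrees with the legality of the cover; final states are not examined here.

```python
# Atomaton transitions as read from Table 25 (empty string: no successor).
ATOMATON = {
    "A": {"a": "AB", "b": ""},
    "B": {"a": "C", "b": "A"},
    "C": {"a": "", "b": "BD"},
    "D": {"a": "", "b": "C"},
    "E": {"a": "D", "b": ""},
    "F": {"a": "EF", "b": "EF"},
}

# K_0..K_8 as sets of atoms, and the lookup from atom set to quotient index.
QUOTIENTS = [frozenset(q) for q in
             ("BDF", "CEF", "ACEF", "DEF", "BDEF",
              "ABDEF", "CDEF", "ABCDEF", "BCDEF")]
INDEX = {q: i for i, q in enumerate(QUOTIENTS)}

# Quotient sets P of the grids g_0..g_3 of the cover in Table 28.
GRID_ROWS = {0: {0, 4, 5, 7, 8}, 1: {1, 2, 6, 7, 8},
             2: {2, 5, 7}, 3: {3, 4, 5, 6, 7, 8}}


def quotient_step(i, x):
    """Index of x^{-1}K_i, found by moving every atom of K_i in the atomaton."""
    return INDEX[frozenset(t for s in QUOTIENTS[i] for t in ATOMATON[s][x])]


def f(i):
    """Grids whose quotient set P contains K_i."""
    return frozenset(g for g, P in GRID_ROWS.items() if i in P)


def eta(g, x):
    """Intersection of f(x^{-1}K_i) over all K_i in the quotient set of g."""
    result = frozenset(GRID_ROWS)
    for i in GRID_ROWS[g]:
        result &= f(quotient_step(i, x))
    return result


NFA = {g: {x: sorted(eta(g, x)) for x in "ab"} for g in GRID_ROWS}

# Do the grids of f(K_i) jointly move on x to exactly f(x^{-1}K_i)?
consistent = all(
    frozenset().union(*(eta(g, x) for g in f(i))) == f(quotient_step(i, x))
    for i in range(len(QUOTIENTS)) for x in "ab"
)

for g, row in NFA.items():
    print(g, row)
print(consistent)
```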

We believe that NFA's defined by grids are a topic for future research. 

6 Conclusions 

We have studied the properties of atomic NFA's. We have shown that atoms 
play an important role in NFA minimization and proved that it is not enough 
to search for atomic NFA's only.

Table 29. Construction of NFA $\mathcal{N}_{min}$.

        g     P                a^{-1}P      η_C(g,a)      b^{-1}P      η_C(g,b)
    →   g0    {0,4,5,7,8}      {1,6,7}      {g1}          {2,7}        {g1,g2}
        g1    {1,2,6,7,8}      {3,5,6,7}    {g3}          {4,7,8}      {g0,g3}
    ←   g2    {2,5,7}          {5,7}        {g0,g2,g3}    {2,4,7}      ∅
        g3    {3,4,5,6,7,8}    {3,6,7}      {g3}          {1,2,7,8}    {g1}


References 

1. Arnold, A., Dicky, A., Nivat, M.: A note about minimal non-deterministic
automata. Bull. EATCS 47, 166-169 (1992)

2. Brzozowski, J.: Canonical regular expressions and minimal state graphs for definite 
events. In: Proc. Symp. on Mathematical Theory of Automata. MRI Symposia 
Series, vol. 12, pp. 529-561. Polytechnic Institute of Brooklyn, N.Y. (1963) 

3. Brzozowski, J., Tamm, H.: Theory of atomata. In: Mauri, G., Leporati, A. (eds.)
DLT 2011. LNCS, vol. 6795, pp. 105-116. Springer (2011)

4. Brzozowski, J., Tamm, H.: Quotient complexities of atoms of regular languages. 
In: Yen, H.C., Ibarra, O. (eds.) DLT 2012. LNCS, vol. 7410, pp. 50-61. Springer 
(2012) 

5. Ilie, L., Yu, S.: Reducing NFAs by invariant equivalences. Theoret. Comput. Sci.
306, 373-390 (2003)

6. Kameda, T., Weiner, P.: On the state minimization of nondeterministic automata. 
IEEE Trans. Comput. C-19(7), 617-627 (1970) 

7. Matz, O., Potthoff, A.: Computing small finite nondeterministic automata. In:
Engberg, U.H., Larsen, K.G., Skou, A. (eds.) Proc. Workshop on Tools and
Algorithms for Construction and Analysis of Systems, pp. 74-88. BRICS, Aarhus,
Denmark (1995)

8. Ott, G., Feinstein, N.: Design of sequential machines from their regular expressions.
J. ACM 8, 585-600 (1961)

9. Polak, L.: Minimalizations of NFA using the universal automaton. Internat. J. 
Found. Comput. Sci. 16(5), 999-1010 (2005) 

10. Rabin, M., Scott, D.: Finite automata and their decision problems. IBM J. Res. 
and Dev. 3, 114-129 (1959) 

11. Sengoku, H.: Minimization of nondeterministic finite automata. Master's thesis, 
Kyoto University, Department of Information Science, Kyoto, Japan (1992) 






